Search for: All records

Creators/Authors contains: "Santhanam, Narayana"


  1. Logistic regression is a widely used generalized linear model, applied in classification settings to assign probabilities to class labels. It is also well known that logistic regression is a maximum entropy procedure subject to what are sometimes called the balance conditions. The dominant view in existing explanations is discriminative, i.e., modeling the labels given the data. This paper adds to the maximum entropy interpretation by establishing a generative maximum entropy explanation for the commonly used logistic regression training and optimization procedures. We show that logistic regression models the conditional distribution on the instance space given the class labels with a maximum entropy model subject to a first-moment constraint on the training data, and that the commonly used fitting procedure can be read as a Monte Carlo fit under this generative view. 
    Free, publicly-accessible full text available June 27, 2026
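    The balance conditions referenced in this abstract are easy to check numerically. Below is a minimal sketch, not the paper's procedure: the synthetic data, the plain gradient-ascent fit, and the step size are all illustrative assumptions. At the optimum, the model's expected per-feature totals match the empirical totals of the positive class, i.e., X^T p = X^T y.

        import numpy as np

        # Minimal sketch of the logistic-regression balance conditions.
        # Synthetic data and optimizer settings are illustrative assumptions.
        rng = np.random.default_rng(0)
        n, d = 200, 3
        X = rng.normal(size=(n, d))
        w_true = np.array([1.0, -2.0, 0.5])
        y = (rng.random(n) < 1 / (1 + np.exp(-X @ w_true))).astype(float)

        w = np.zeros(d)
        for _ in range(20000):
            p = 1 / (1 + np.exp(-X @ w))
            w += 0.5 * X.T @ (y - p) / n   # gradient ascent on the mean log-likelihood

        # Balance (first-moment) conditions: the model's expected feature mass
        # matches the empirical feature mass of the positive examples.
        p = 1 / (1 + np.exp(-X @ w))
        print("balance residual:", np.linalg.norm(X.T @ (p - y)) / n)

    As the fit converges, the printed residual shrinks toward zero, which is exactly the first-moment constraint the abstract describes.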
  2. There has been explosive growth in the use of AI, data science, and machine learning in all aspects of daily life, and there is global competition among governments, industry, and academic institutions to lead research and development in this area. This paper discusses a novel multidisciplinary graduate education and research program at our institution that helps develop a trained workforce able to understand and develop AI, data science, and machine learning technologies. The program brings together faculty and students in engineering, computer science, and social science in a traineeship program where cohort teams study fundamental and applied data science research, using compact modules across courses to personalize instruction and prepare each trainee with skills tailored to their prior experience and future career goals. 
    Free, publicly-accessible full text available July 7, 2026
  3. This paper develops bounds for learning lossless source coding under the PAC (probably approximately correct) framework. The paper considers iid sources with online learning: the coder first learns the data structure from training sequences; when presented with a test sequence for compression, it continues to learn from and adapt to the test sequence. The results show, unsurprisingly, that there is little gain from online learning when the training sequence is much longer than the test sequence. But if the test sequence is longer than the training sequence, there is a significant gain. Coders for online learning have a somewhat surprising structure: the training sequence is used to estimate a confidence interval for the distribution, and the coding distribution is found through a prior distribution over this interval. 
    Free, publicly-accessible full text available November 10, 2025
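    The interval-plus-prior structure described above can be sketched for the simplest case, a Bernoulli iid source. The interval width, the grid approximation to a uniform prior, and the sequence lengths below are my own illustrative assumptions, not the paper's construction:

        import numpy as np

        # Sketch: estimate a confidence interval for p from training data, then
        # code the test sequence with the Bayes mixture over that interval.
        rng = np.random.default_rng(1)
        p_true = 0.3
        train = rng.random(500) < p_true
        test = rng.random(100) < p_true

        p_hat = train.mean()
        half = 2 * np.sqrt(p_hat * (1 - p_hat) / train.size)  # crude 2-sigma interval
        lo, hi = max(p_hat - half, 1e-3), min(p_hat + half, 1 - 1e-3)

        grid = np.linspace(lo, hi, 1000)   # uniform prior over the interval
        bits, n1, n0 = 0.0, 0, 0
        for x in test:
            lik = grid**n1 * (1 - grid)**n0          # likelihood of the history so far
            post = lik / lik.sum()
            q1 = float((grid * post).sum())          # predictive probability of a 1
            bits += -np.log2(q1 if x else 1.0 - q1)  # ideal code length for this symbol
            n1, n0 = n1 + int(x), n0 + 1 - int(x)

        h = -(p_true * np.log2(p_true) + (1 - p_true) * np.log2(1 - p_true))
        print(f"{bits / test.size:.3f} bits/symbol vs. source entropy {h:.3f}")

    The per-symbol code length approaches the source entropy; the mixture over the learned interval is what lets the coder keep adapting on the test sequence.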
  4. Free, publicly-accessible full text available November 10, 2025
  5. Given a length-n sample from R^d and a neural network with a fixed architecture with W weights, k neurons, linear threshold activation functions, and binary outputs on each neuron, we study the problem of uniformly sampling from all possible labelings on the sample corresponding to different choices of weights. We provide an algorithm that runs in time polynomial in both n and W such that any labeling appears with probability at least (W/(2ekn))^W for W < n. For a single neuron, we also provide a random walk based algorithm that samples exactly uniformly. 
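    For intuition about why this is nontrivial, the naive approach of drawing random weights and reading off the induced labeling is generally far from uniform over labelings. The sketch below (a single threshold neuron on a tiny synthetic sample; not the paper's algorithm) makes the skew visible:

        import numpy as np

        # Naive baseline, NOT the paper's algorithm: sample Gaussian weights for a
        # single linear threshold neuron and record the labeling it induces on the
        # sample. The resulting distribution over labelings is visibly non-uniform.
        rng = np.random.default_rng(2)
        n, d = 6, 2
        X = rng.normal(size=(n, d))

        def labeling(w):
            return tuple((X @ w > 0).astype(int))   # binary labels on the n points

        counts = {}
        for _ in range(20000):
            lab = labeling(rng.normal(size=d))
            counts[lab] = counts.get(lab, 0) + 1
        print(len(counts), "distinct labelings; frequency range",
              min(counts.values()), "-", max(counts.values()))

    The large spread between the rarest and most common labelings is the gap the paper's polynomial-time sampler and exact random-walk sampler are designed to close.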
  6. Full version with complete proofs attached. Please email NS (email in paper) for other versions if desired. 